
    Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

    We present rectified flow, a surprisingly simple approach to learning (neural) ordinary differential equation (ODE) models that transport between two empirically observed distributions \pi_0 and \pi_1, hence providing a unified solution to generative modeling and domain transfer, among various other tasks involving distribution transport. The idea of rectified flow is to learn the ODE to follow the straight paths connecting points drawn from \pi_0 and \pi_1 as much as possible. This is achieved by solving a straightforward nonlinear least squares optimization problem, which can easily be scaled to large models without introducing extra parameters beyond standard supervised learning. The straight paths are special and preferred because they are the shortest paths between two points and can be simulated exactly without time discretization, hence yielding computationally efficient models. We show that the procedure of learning a rectified flow from data, called rectification, turns an arbitrary coupling of \pi_0 and \pi_1 into a new deterministic coupling with provably non-increasing convex transport costs. In addition, recursively applying rectification allows us to obtain a sequence of flows with increasingly straight paths, which can be simulated accurately with coarse time discretization at inference time. In empirical studies, we show that rectified flow performs superbly on image generation, image-to-image translation, and domain adaptation. In particular, on image generation and translation, our method yields nearly straight flows that give high-quality results even with a single Euler discretization step.
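    The straight-path least squares objective described above can be written down directly. Below is a minimal PyTorch-style sketch of the training loss; names such as `velocity_net` are illustrative assumptions, not identifiers from the paper.

```python
import torch

def rectified_flow_loss(velocity_net, x0, x1):
    """Nonlinear least squares over straight-line interpolations.

    x0 ~ pi_0 and x1 ~ pi_1 are paired samples (any coupling works, e.g. an
    independent one). The network learns to predict the constant velocity
    x1 - x0 of the straight path between the two points.
    """
    # One random time per sample, broadcast over the remaining dimensions.
    t = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)), device=x0.device)
    xt = t * x1 + (1.0 - t) * x0      # point on the straight path at time t
    target = x1 - x0                  # velocity of the straight path
    pred = velocity_net(xt, t)        # ODE drift v(x_t, t)
    return ((pred - target) ** 2).mean()
```

    At inference time, samples follow by integrating dx/dt = velocity_net(x, t) from t = 0 to t = 1; when the learned flow is nearly straight, a single Euler step x1 ≈ x0 + velocity_net(x0, 0) already gives reasonable results.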

    Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models

    Retriever-reader models achieve competitive performance across many NLP tasks such as open question answering and dialogue. In this work, we observe that these models easily overfit to the top-ranked retrieved passages and that standard training fails to reason over the full set of retrieved passages. We introduce a learnable passage mask mechanism that reduces the model's sensitivity to the top-ranked passages and prevents overfitting. By controlling gradient variance with fewer mask candidates and selecting the mask candidates via one-shot bi-level optimization, our learnable regularization strategy encourages answer generation to draw on all retrieved passages. Experiments on open question answering, dialogue, and fact verification show that our method consistently outperforms its baselines. Extensive experiments and ablation studies demonstrate that our method is general, effective, and beneficial for many NLP tasks. Comment: EMNLP 202
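    As a rough illustration of the general idea only (not the paper's exact mechanism), a learnable per-passage gate could be applied to the retrieved passage encodings before the reader; the mask parameterization and the one-shot bi-level optimization are simplified away, and all names below are hypothetical.

```python
import torch
import torch.nn as nn

class PassageMask(nn.Module):
    """Illustrative soft mask over retrieved passages, one gate per retrieval rank."""
    def __init__(self, num_passages: int):
        super().__init__()
        # One learnable logit per retrieval rank (hypothetical parameterization).
        self.logits = nn.Parameter(torch.zeros(num_passages))

    def forward(self, passage_embs: torch.Tensor) -> torch.Tensor:
        # passage_embs: (batch, num_passages, hidden)
        gates = torch.sigmoid(self.logits)            # soft mask in (0, 1) per rank
        return passage_embs * gates.view(1, -1, 1)    # down-weight over-relied passages
```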

    Certified Monotonic Neural Networks

    Learning models that are monotonic with respect to a subset of the inputs is desirable for effectively addressing fairness, interpretability, and generalization issues in practice. Existing methods for learning monotonic neural networks either require specially designed model structures to ensure monotonicity, which can be overly restrictive or complicated, or enforce monotonicity by adjusting the learning process, which cannot provably guarantee that the learned model is monotonic on the selected features. In this work, we propose to certify the monotonicity of general piecewise-linear neural networks by solving a mixed integer linear programming problem. This provides a new general approach for learning monotonic neural networks with arbitrary model structures. Our method allows us to train neural networks with heuristic monotonicity regularizations, gradually increasing the regularization magnitude until the learned network is certified monotonic. Compared to prior works, our approach does not require human-designed constraints on the weight space and also yields more accurate approximations. Empirical studies on various datasets demonstrate the efficiency of our approach over state-of-the-art methods such as Deep Lattice Networks.
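    A minimal sketch of the heuristic monotonicity regularizer mentioned above: penalize negative partial derivatives of the output with respect to the selected features at sampled inputs, increasing the penalty weight until the separate MILP certification (not shown) passes. Function and argument names are assumptions for illustration.

```python
import torch

def monotonicity_penalty(model, x, monotone_dims, lam=1.0):
    """Hinge penalty on negative gradients along features that must be increasing."""
    x = x.clone().requires_grad_(True)
    y = model(x).sum()
    grads = torch.autograd.grad(y, x, create_graph=True)[0]   # (batch, n_features)
    violation = torch.relu(-grads[:, monotone_dims])           # > 0 where monotonicity is violated
    return lam * violation.mean()
```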

    Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

    We consider the post-training quantization problem, which discretizes the weights of pre-trained deep neural networks without re-training the model. We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods, which approximate each weight using a single low-precision number. Computationally, we construct the multipoint quantization with an efficient greedy selection procedure, and adaptively decide the number of low-precision points for each quantized weight vector based on the error of its output. This allows us to achieve higher precision levels for important weights that greatly influence the outputs, yielding an 'effect of mixed precision' without physical mixed-precision implementations (which require specialized hardware accelerators). Empirically, our method can be implemented with common operands, introducing almost no memory or computation overhead. We show that our method outperforms a range of state-of-the-art methods on ImageNet classification and can be generalized to more challenging tasks such as PASCAL VOC object detection. Comment: Accepted by AAAI202
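    A hedged sketch of the multipoint idea: approximate a full-precision weight vector by a sum of scaled low-bit vectors, greedily fitting the residual and stopping once the approximation is good enough. The paper's actual greedy criterion is based on output error; the weight-space residual used here is a simplification for illustration.

```python
import numpy as np

def quantize_low_bit(v, bits=2):
    """Uniform symmetric quantization of v to `bits`-bit integers times a scale."""
    qmax = 2 ** (bits - 1) - 1
    vmax = np.abs(v).max()
    scale = vmax / qmax if vmax > 0 else 1.0
    return np.clip(np.round(v / scale), -qmax, qmax) * scale

def multipoint_quantize(w, bits=2, max_points=4, tol=1e-3):
    """Greedily accumulate low-bit vectors whose sum approximates w."""
    residual = w.copy()
    points = []
    for _ in range(max_points):
        q = quantize_low_bit(residual, bits)
        points.append(q)
        residual = residual - q
        if np.linalg.norm(residual) / (np.linalg.norm(w) + 1e-12) < tol:
            break  # enough points for this weight vector
    return points  # the approximation of w is sum(points)
```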

    Diffusion-based Molecule Generation with Informative Prior Bridges

    AI-based molecule generation provides a promising approach to a broad range of problems in the biomedical sciences and engineering, such as antibody design, hydrolase engineering, and vaccine development. Because molecules are governed by physical laws, a key challenge is to incorporate prior information into the training procedure to generate high-quality and realistic molecules. We propose a simple and novel approach to steer the training of diffusion-based generative models with physical and statistical prior information. This is achieved by constructing physically informed diffusion bridges, stochastic processes that are guaranteed to yield a given observation at a fixed terminal time. We develop a Lyapunov-function-based method to construct and determine bridges, and propose a number of informative prior bridges for both high-quality molecule generation and uniformity-promoting 3D point cloud generation. With comprehensive experiments, we show that our method provides a powerful approach to 3D generation tasks, yielding molecular structures with better quality and stability scores and more uniformly distributed point clouds of high quality.
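    To make the bridge idea concrete, below is a minimal sketch of the simplest such process, a Brownian bridge, whose drift (x_T - x_t) / (T - t) guarantees that the process reaches the terminal observation x_T at time T. This is only the textbook building block; the paper's physically informed bridges add problem-specific prior drift terms on top of this idea.

```python
import numpy as np

def simulate_brownian_bridge(x0, xT, T=1.0, n_steps=1000, sigma=1.0, rng=None):
    """Euler-Maruyama simulation of a Brownian bridge from x0 at t=0 to xT at t=T."""
    rng = rng or np.random.default_rng()
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    xT = np.array(xT, dtype=float)
    path = [x.copy()]
    for i in range(n_steps):
        t = i * dt
        drift = (xT - x) / (T - t)                 # pulls the process toward x_T
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        path.append(x.copy())
    path[-1] = xT.copy()                            # pin the terminal observation exactly
    return np.stack(path)
```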

    Neural Volumetric Mesh Generator

    Deep generative models have shown success in generating 3D shapes with different representations. In this work, we propose the Neural Volumetric Mesh Generator (NVMG), which can generate novel and high-quality volumetric meshes. Unlike previous 3D generative models for point clouds, voxels, and implicit surfaces, the volumetric mesh representation is ready to use in industry, with details on both the surface and the interior. Generating such highly structured data is therefore a significant challenge. We first propose a diffusion-based generative model that tackles this problem by generating voxelized shapes with realistic outlines and structures. From the voxelized shape we can directly obtain a tetrahedral mesh to serve as a template. We then use a voxel-conditional neural network to predict a smooth implicit surface conditioned on the voxels, and progressively project the tetrahedral mesh onto the predicted surface under regularizations. The regularization terms are carefully designed so that they (1) remove defects such as flipping and high distortion, and (2) enforce regularity of the interior and surface structure during the deformation procedure, yielding a high-quality final mesh. As shown in the experiments, our pipeline can generate high-quality, artifact-free volumetric and surface meshes from random noise or a reference image without any post-processing. Compared with the state-of-the-art voxel-to-mesh deformation method, our approach is more robust and performs better when taking generated voxels as input.
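    As a rough, assumption-laden sketch of the projection step: template mesh vertices are moved toward the zero level set of the predicted implicit surface while a smoothness term discourages high distortion. The names (`implicit_sdf`, the uniform Laplacian regularizer) are illustrative choices, not the paper's exact regularization terms.

```python
import torch

def projection_loss(vertices, implicit_sdf, adjacency, w_surface=1.0, w_smooth=0.1):
    # vertices: (V, 3) learnable positions of the template's surface nodes
    # implicit_sdf: network mapping (V, 3) -> (V,) signed distance to the target surface
    # adjacency: (V, V) row-normalized adjacency matrix, so adjacency @ vertices = neighbor means
    surface_term = implicit_sdf(vertices).abs().mean()        # pull vertices onto the surface
    neighbor_mean = adjacency @ vertices                       # (V, 3)
    # Uniform Laplacian smoothness: keep each vertex close to the mean of its neighbors.
    smooth_term = ((vertices - neighbor_mean) ** 2).sum(dim=1).mean()
    return w_surface * surface_term + w_smooth * smooth_term
```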

    Molecular subgroups of adult medulloblastoma: a long-term single-institution study

    Background: Recent transcriptomic approaches have demonstrated that there are at least 4 distinct subgroups in medulloblastoma (MB); however, survival studies of molecular subgroups in adult MB have been inconclusive because of small sample sizes. The aim of this study is to investigate the molecular subgroups in adult MB and identify their clinical and prognostic implications in a large, single-institution cohort. Methods: We determined gene expression profiles for 13 primary adult MBs. Bioinformatics tools were used to establish distinct molecular subgroups based on the most informative genes in the dataset. Immunohistochemistry with subgroup-specific antibodies was then used for validation within an independent cohort of 201 formalin-fixed MB tumors, in conjunction with a systematic analysis of clinical and histological characteristics. Results: Three distinct molecular variants of adult MB were identified: the SHH, WNT, and group 4 subgroups. Validation of these subgroups in the 201-tumor cohort by immunohistochemistry identified significant differences in subgroup-specific demographics, histology, and metastatic status. The SHH subgroup accounted for the majority of the tumors (62%), followed by the group 4 subgroup (28%) and the WNT subgroup (10%). Group 4 tumors had significantly worse progression-free and overall survival compared with tumors of the other molecular subtypes. Conclusions: We have identified 3 subgroups of adult MB, characterized by distinct expression profiles, clinical features, pathological features, and prognosis. Clinical variables incorporated with molecular subgroup are significantly more informative for predicting adult patient outcome.

    Effects of Normal Stress and Joint Inclination Angle on Rock Failure Characteristics Under Compression–Shear Conditions

    In this study, cement mortar was used to make specimens containing groups of parallel joints with different inclination angles to simulate natural rock mass, and the specimens were subjected to shear tests under different normal stresses. By analyzing the crack propagation paths, failure modes, and strength characteristics of these specimens, the effects of normal stress and joint inclination angle on the strength and failure characteristics of this type of rock mass were studied. The following conclusions are drawn: 1) When the joint inclination angles were 0° and 15°, changing the normal stress did not affect the failure mode of the rock mass; the rock mass mainly failed in shear, and increasing the normal stress only increased the spalling area of the rock mass. 2) When the joint inclination angles were 30°, 45°, and 60°, with increasing normal stress the number of approximately parallel cracks in the specimens increased, the friction marks caused by shearing increased, and the failure mode of the rock mass changed from tension failure to tension–shear composite failure. 3) Under different joint inclination angles, the propagation and penetration paths of cracks generated in the rock mass and the damage mode of the rock mass differed. With an increase in the joint inclination angle, the damage mode of the rock mass gradually changed from shear damage to tensile–shear composite damage, and the α and β angles between the through cracks and the vertical direction on the left and right sides of the specimens tended to decrease. 4) The shear resistance of the rock mass was affected by both the joint inclination angle and the normal stress. The shear resistance of the rock mass improved with increasing normal stress. Within a certain range, as the joint inclination angle increased, the shear resistance of the rock mass tended to first decrease and then increase.

    Altered Regional and Circuit Resting-State Activity Associated with Unilateral Hearing Loss

    The deprivation of sensory input after hearing damage results in functional reorganization of the brain, including cross-modal plasticity in the sensory cortex and changes in cognitive processing. However, it remains unclear whether partial deprivation from unilateral hearing loss (UHL) would similarly affect the neural circuitry of cognitive processes in addition to the functional organization of the sensory cortex. Here, we used resting-state functional magnetic resonance imaging to investigate intrinsic activity in 34 participants with UHL from acoustic neuroma in comparison with 22 matched normal controls. In sensory regions, we found decreased regional homogeneity (ReHo) in the bilateral calcarine cortices in UHL. However, there was an increase of ReHo in the right anterior insular cortex (rAI), a key node of the cognitive control network (CCN) and of multimodal sensory integration, as well as in the left parahippocampal cortex (lPHC), a key node of the default mode network (DMN). Moreover, seed-based resting-state functional connectivity analysis showed an enhanced relationship between rAI and several key regions of the DMN. Meanwhile, lPHC showed a more negative relationship with components of the CCN and a more positive relationship within the DMN. Such reorganizations of functional connectivity within the DMN and between the DMN and the CCN were confirmed by a graph theory analysis. These results suggest that unilateral sensory input damage not only alters the activity of sensory areas but also reshapes the regional and circuit-level functional organization of the cognitive control network.